Skip to content

Open-source readiness: engine bootstrap, branding, release infra#3

Merged
anandgupta42 merged 9 commits intomainfrom
open-source-readiness
Mar 1, 2026
Merged

Open-source readiness: engine bootstrap, branding, release infra#3
anandgupta42 merged 9 commits intomainfrom
open-source-readiness

Conversation

@anandgupta42
Copy link
Copy Markdown
Contributor

@anandgupta42 anandgupta42 commented Mar 1, 2026

Summary

Prepares the monorepo for open-source release at github.com/AltimateAI/altimate-code. This PR covers four areas:

  • Engine bootstrap — uv-managed Python engine isolation (bridge/engine.ts). The CLI auto-downloads uv, creates an isolated Python 3.12 venv at ~/.local/share/altimate-code/engine/, and installs altimate-engine with zero system dependencies required. Adds altimate-code engine status/reset/path subcommand.
  • Branding cleanup — Renames all opencode/anomalyco/OPENCODE_* references to altimate-code/AltimateAI/ALTIMATE_CLI_* across 100+ files (source, build scripts, Docker, Homebrew, AUR, SDK exports, tests, theme URLs). Fixes a critical bug where build defines (OPENCODE_VERSION) didn't match consumer declarations (ALTIMATE_CLI_VERSION), causing version detection to fall back to "local" in production builds.
  • Release infrastructure — Unified release workflow (v* tag → npm + PyPI + GitHub Release), engine-only publish workflow (engine-v* → PyPI), version sync script (bump-version.ts), and RELEASING.md guide.
  • Open-source files — LICENSE (MIT), README, CONTRIBUTING, CODE_OF_CONDUCT, CHANGELOG, SECURITY, GitHub issue/PR templates, dependabot config, CI workflow.

Files changed: 120 (101 modified + 18 new + 1 snapshot)

Key new files

File Purpose
packages/altimate-code/src/bridge/engine.ts uv-managed engine lifecycle (download, venv, install, upgrade)
packages/altimate-code/src/cli/cmd/engine.ts altimate-code engine CLI subcommand
packages/altimate-code/script/bump-version.ts Version sync between pyproject.toml and init.py
.github/workflows/release.yml Unified release: npm + PyPI + GitHub Release
.github/workflows/ci.yml CI: TypeScript (Bun) + Python (3.10/3.11/3.12 matrix)
.github/workflows/publish-engine.yml Engine-only PyPI releases
RELEASING.md Complete release process documentation

Remaining opencode references (all legitimate)

  • @gitlab/opencode-gitlab-auth — third-party npm package
  • providerID: "opencode" in test fixture — external OpenCode Zen AI provider
  • sst/opencode in test — generic GitHub URL parser test data
  • models-api.json — external provider definition from models.dev

Before publishing checklist

These steps must be completed before the first release:

1. npm token

  • Create an npm access token with publish permissions
  • Add it as NPM_TOKEN in GitHub repository secrets (Settings > Secrets and variables > Actions)

2. PyPI trusted publishing (OIDC — no API tokens needed)

  1. Go to https://pypi.org/manage/account/publishing/ (logged in as <pypi-account>)
  2. Add a new pending publisher:
    • Package name: altimate-engine
    • Owner: AltimateAI
    • Repository: altimate-code
    • Workflow: release.yml
    • Environment: pypi
  3. Create a pypi environment in GitHub repo settings (Settings > Environments > New environment > pypi)

3. Homebrew tap

  • Create AltimateAI/homebrew-tap repository on GitHub
  • Ensure the GITHUB_TOKEN used in Actions has write access to this repo (or add a PAT as HOMEBREW_TAP_TOKEN)

4. First release

# Bump engine version
bun run packages/altimate-code/script/bump-version.ts --engine 0.1.0

# Commit and tag
git add -A
git commit -m "release: v0.1.0"
git tag v0.1.0
git push origin main --tags

5. Optional: AUR (Linux)

  • Register altimate-code-bin package on AUR
  • Set up SSH key for AUR push access in CI secrets

6. Verify after first release

npm info altimate-code-ai version          # npm
pip install altimate-engine==0.1.0         # PyPI
brew update && brew info altimate/tap/altimate-code  # Homebrew
docker pull ghcr.io/<org>/altimate-code:<version>   # Docker

Test plan

  • Verify bun run packages/altimate-code/script/build.ts --single succeeds and ALTIMATE_ENGINE_VERSION is injected
  • Verify cd packages/altimate-engine && pytest passes (283+ tests)
  • Verify cd packages/altimate-engine && python -m build --sdist produces valid package
  • Verify CI workflow runs on this PR
  • Verify no OPENCODE_ or anomalyco references remain: grep -rn "OPENCODE_\|anomalyco" --include="*.ts" --include="*.py" --include="*.yml"

🤖 Generated with Claude Code

…ease infrastructure

- Add uv-managed Python engine bootstrap (bridge/engine.ts) — zero system dependencies
- Add `altimate-code engine status/reset/path` CLI subcommand
- Add unified release workflow (npm + PyPI + GitHub Release on v* tags)
- Add CI workflow (TypeScript + Python 3.10/3.11/3.12 matrix)
- Add engine-only publish workflow (engine-v* tags → PyPI)
- Add version sync script (bump-version.ts)
- Add open-source files: LICENSE (MIT), README, CONTRIBUTING, CODE_OF_CONDUCT, CHANGELOG, SECURITY
- Add GitHub templates: bug report, feature request, PR template, dependabot
- Fix build defines: OPENCODE_* → ALTIMATE_CLI_* (was breaking version detection)
- Fix binary/package names: opencode → altimate-code throughout
- Fix Docker/Homebrew/AUR: anomalyco/opencode → AltimateAI/altimate-code
- Rename SDK exports: OpencodeClient → AltimateClient, createOpencodeClient → createAltimateClient
- Update theme $schema URLs: opencode.ai → altimate-code.sh
- Update pyproject.toml with full PyPI metadata
- Update ping protocol to return engine version
- Strip v prefix from ALTIMATE_CLI_VERSION in Script module
- Update 30+ test files: env vars, paths, config names

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@anandgupta42 anandgupta42 requested a review from kulvirgit March 1, 2026 08:13
anandgupta42 and others added 3 commits March 1, 2026 00:26
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…ry test

- Agent config keys in tests: build → builder (matches actual agent name)
- Read test context: agent "build" → "builder"
- defaultAgent tests: disable all primary agents (builder, analyst, validator, migrator, plan)
- Registry test: handle cowsay import failure gracefully in sandboxed envs

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Keep both our dev dependencies and main's new security/docker/tunneling
optional dependency groups.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Comment on lines +59 to +62
// 4. Production: uv-managed engine
const { ensureEngine, enginePythonPath } = await import("./engine")
await ensureEngine()
return enginePythonPath()

This comment was marked as outdated.

anandgupta42 and others added 2 commits March 1, 2026 00:53
- Add duckdb to dev dependencies so CI installs it (fixes 19 test failures)
- Add pytest skip markers for tests requiring sqlguard (proprietary Rust
  extension not available in CI) and boto3 (optional warehouse dependency)
- Wrap ensureEngine() in try-catch in resolvePython() to show a clear error
  message instead of crashing the CLI on bootstrap failure (Sentry review)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Auto-fixed 18 unused import and redefinition warnings in server.py,
cache.py, and feedback_store.py. Added noqa for keyring availability
check import in credential_store.py.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Comment on lines 74 to 76
child = spawn(pythonCmd, ["-m", "altimate_engine.server"], {
stdio: ["pipe", "pipe", "pipe"],
})

This comment was marked as outdated.

anandgupta42 and others added 3 commits March 1, 2026 00:59
Add an `error` event handler on the child process so the CLI doesn't
crash with an unhandled exception if the Python executable can't be
found (e.g., corrupted venv). Flagged by Sentry review.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
engine.ts:
- Add promise-based mutex to prevent concurrent ensureEngine() calls
  from corrupting state (race condition on venv/manifest)

client.ts:
- Add re-entrancy guard on start() to prevent recursive start->ping->
  call->start loops when the Python process dies immediately
- Clear setTimeout on successful response to prevent timer accumulation
  over long sessions (thousands of leaked timer callbacks)
- Reset restartCount on successful start so transient failures don't
  permanently disable the bridge
- Add error handler on child.stdin to prevent unhandled exception when
  writing to a closed pipe

publish.ts:
- Pass GitHub token via GIT_CONFIG env vars instead of embedding it in
  the git clone URL, which leaked it in process args and .git/config

release.yml:
- Validate tag format with strict semver regex before use in shell
- Quote all variable expansions to prevent shell injection via crafted
  tag names (e.g., v$(malicious-command))
- Pass github.ref_name via env var instead of inline ${{ }} expansion

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Docker distribution is not needed. The unconditional docker buildx call
would have blocked the AUR/Homebrew updates that follow it.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@anandgupta42 anandgupta42 merged commit f693149 into main Mar 1, 2026
4 checks passed
@anandgupta42 anandgupta42 deleted the open-source-readiness branch March 2, 2026 05:13
anandgupta42 added a commit that referenced this pull request Mar 22, 2026
- Track loops by `(tool, inputHash)` not just tool name (#2)
- Use "Failed after" narrative for error traces (#3)
- Add keyboard accessibility to viewer tabs (role, tabindex, Enter/Space) (#4)
- Use full command as dedup key, not `slice(0,60)` (#5)
- Sort timeline events by time before rendering (#6)
- Pass `tracesDir` to footer text in `listRecaps` (#7)
- Increase `MAX_RECAPS` to 100, add eviction warning log (#8)
- Resolve assistant `parentID` for recap enrichment (#9)
- Remove unused `tracer` variable in test (#10)
- Clarify `--no-trace` backward-compat flag in docs (#1)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
anandgupta42 added a commit that referenced this pull request Mar 23, 2026
…381)

* feat: rename tracer to recap with loop detection, post-session summary, and enhanced viewer

- Rename `Tracer` class to `Recap` with backward-compat aliases
- Rename CLI command `trace` to `recap` (hidden `trace` alias preserved)
- Add loop detection: flags repeated tool calls with same input (3+ in last 10)
- Add post-session summary: `narrative`, `topTools`, `loops` in trace output
- New Summary tab (default) in HTML viewer with:
  - Truncated prompt with expand toggle
  - Files changed with SQL diff previews
  - Tool-agnostic outcome extraction (dbt, pytest, Airflow, pip, SQL)
  - Deduped dbt commands with pass/fail status, clickable to waterfall
  - Smart command grouping (boring ls/cd collapsed, meaningful shown)
  - Error details with resolution tracking
  - Cost breakdown in collapsible section
- Virality: Share Recap (self-contained HTML download), Copy Summary (markdown),
  Copy Link, branded footer
- Fix XSS: timeline items escaped with `e()`
- Fix memory leak: per-session `sessionUserMsgIds` with cleanup on eviction
- Fix JS syntax: onclick quote escaping in collapsible section
- Bound `toolCallHistory` to prevent unbounded growth (cap at 200)
- Summary view wrapped in try-catch for visible error messages
- Update all 13 test files for rename + 8 new adversarial viewer tests
- Update docs: `tracing.md` → `recap.md`, CLI/TUI references updated

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: share/copy buttons scoping bug + `t.text` undefined + adversarial viewer tests

- Fix critical bug: Share Recap and Copy Summary buttons referenced variables
  from Summary IIFE scope — rewrote `buildMarkdownSummary` to be self-contained
- Fix `t.text` → `t.result` in narrative (was rendering "undefined")
- Fix `sessionUserMsgIds` not cleaned on MAX_RECAPS eviction (memory leak)
- Fix zero cost display: show `$0.00` instead of em-dash
- Add try-catch error boundary around Summary view rendering
- Add 8 adversarial viewer tests: XSS, NaN/Infinity, null metadata,
  200+ spans, JS syntax validation, tool-agnostic outcomes, backward compat

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: address all 10 CodeRabbit review comments

- Track loops by `(tool, inputHash)` not just tool name (#2)
- Use "Failed after" narrative for error traces (#3)
- Add keyboard accessibility to viewer tabs (role, tabindex, Enter/Space) (#4)
- Use full command as dedup key, not `slice(0,60)` (#5)
- Sort timeline events by time before rendering (#6)
- Pass `tracesDir` to footer text in `listRecaps` (#7)
- Increase `MAX_RECAPS` to 100, add eviction warning log (#8)
- Resolve assistant `parentID` for recap enrichment (#9)
- Remove unused `tracer` variable in test (#10)
- Clarify `--no-trace` backward-compat flag in docs (#1)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: add screenshots and update recap viewer documentation

- Add Summary tab and full-page screenshots to docs
- Update viewer section with 5-tab description
- Detail what Summary tab shows: files changed, outcomes, timeline, cost
- Add screenshot at top of recap.md for quick visual reference

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: move Recap to Use section, Telemetry to Reference

- Move Recap from Configure > Observability to Use (peer to Commands, Skills)
- Move Telemetry from Configure > Observability to Reference (internal analytics)
- Remove the Observability section entirely

Recap is a feature users interact with after sessions, not a config setting.
Telemetry is internal product analytics, not user-facing observability.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: viewer UX improvements from 100-trace analysis

- Collapse Files Changed after 5 entries with "Show all N files" toggle
- Rename "GENS" → "LLM Calls" in header cards
- Hide Tokens card when cost is $0 (not actionable without cost context)
- Hide Cost metric card when $0.00 (wasted space)
- Add prominent error summary banner right after header metrics
- Improved dbt outcome detection: catch [PASS], [ERROR], N of M, Compilation Error
- Outcome detection rate improved from 18% → 33% across 100 real traces
- Updated doc screenshots with cleaner samples

Tested across 100 real production traces: 0 crashes, 0 JS errors.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: always show Cost and Tokens cards

$0.00 is a valid cost (Anthropic Max plan). Hiding it implies
we don't support cost tracking.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: tool-agnostic outcome extraction for schema, validation, SQL, lineage tools

500-trace analysis revealed:
- Schema tasks: 0% outcome visibility → 100%
- Validation tasks: 0% outcome visibility → 100%
- SQL tasks: 55% outcome visibility → 100%

Added outcome extraction for:
- schema_inspect, lineage_check, altimate_core_validate results
- SQL error messages (not just row counts)
- Improved empty session display (shows prompt if available)

Tested across 500 diverse synthetic traces (SQL, Airflow, Dagster,
Python, schema, validation, migration, connectors) + 100 real traces.
0 crashes, 0 JS errors.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: address 4 new CodeRabbit review comments

- Add `inputHash` to `TraceFile.summary.loops` schema type (#11)
- Replace `startTrace()` API name with plain language in docs (#12)
- Use `CSS.escape()` for spanId in querySelector to handle special chars (#13)
- Sort spans by startTime before searching for error resolution (#14)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: round 3 review — sort spans once, clean narrative for 0 LLM calls

- Sort spans once before error resolution loop instead of per-error (perf)
- Narrative omits "Made 0 LLM calls" for tool-only sessions (UX)
- Updated tests to match new narrative format

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: add missing `altimate_change` markers for recap rename in upstream-shared files

Wrap renamed code (Tracer→Recap, trace→recap) with markers so the
Marker Guard CI check passes. The diff-based checker uses -U5 context
windows per hunk — markers must be close enough to added lines to
appear within each hunk's context.

Files fixed:
- `trace.ts` — handler body, option descriptions, viewer message, compat alias
- `app.tsx` — recapViewerServer return, openRecapInBrowser function
- `dialog-trace-list.tsx` — error title, Recaps title, compat alias
- `worker.ts` — getOrCreateRecap, part events, session title/finalization
- `index.ts` — .command(RecapCommand) registration

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: add altimate_change markers to all upstream-shared files

Marker Guard CI was failing — 5 upstream-shared files had custom
code (recap rename) without altimate_change markers.

Fixed: trace.ts, app.tsx, dialog-trace-list.tsx, worker.ts, index.ts

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: type errors in training-import.test.ts from main merge

Pre-existing type issues from main: mock missing `context`/`rule`
fields and readFile return type mismatch.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
kulvirgit pushed a commit that referenced this pull request Mar 30, 2026
…381)

* feat: rename tracer to recap with loop detection, post-session summary, and enhanced viewer

- Rename `Tracer` class to `Recap` with backward-compat aliases
- Rename CLI command `trace` to `recap` (hidden `trace` alias preserved)
- Add loop detection: flags repeated tool calls with same input (3+ in last 10)
- Add post-session summary: `narrative`, `topTools`, `loops` in trace output
- New Summary tab (default) in HTML viewer with:
  - Truncated prompt with expand toggle
  - Files changed with SQL diff previews
  - Tool-agnostic outcome extraction (dbt, pytest, Airflow, pip, SQL)
  - Deduped dbt commands with pass/fail status, clickable to waterfall
  - Smart command grouping (boring ls/cd collapsed, meaningful shown)
  - Error details with resolution tracking
  - Cost breakdown in collapsible section
- Virality: Share Recap (self-contained HTML download), Copy Summary (markdown),
  Copy Link, branded footer
- Fix XSS: timeline items escaped with `e()`
- Fix memory leak: per-session `sessionUserMsgIds` with cleanup on eviction
- Fix JS syntax: onclick quote escaping in collapsible section
- Bound `toolCallHistory` to prevent unbounded growth (cap at 200)
- Summary view wrapped in try-catch for visible error messages
- Update all 13 test files for rename + 8 new adversarial viewer tests
- Update docs: `tracing.md` → `recap.md`, CLI/TUI references updated

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: share/copy buttons scoping bug + `t.text` undefined + adversarial viewer tests

- Fix critical bug: Share Recap and Copy Summary buttons referenced variables
  from Summary IIFE scope — rewrote `buildMarkdownSummary` to be self-contained
- Fix `t.text` → `t.result` in narrative (was rendering "undefined")
- Fix `sessionUserMsgIds` not cleaned on MAX_RECAPS eviction (memory leak)
- Fix zero cost display: show `$0.00` instead of em-dash
- Add try-catch error boundary around Summary view rendering
- Add 8 adversarial viewer tests: XSS, NaN/Infinity, null metadata,
  200+ spans, JS syntax validation, tool-agnostic outcomes, backward compat

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: address all 10 CodeRabbit review comments

- Track loops by `(tool, inputHash)` not just tool name (#2)
- Use "Failed after" narrative for error traces (#3)
- Add keyboard accessibility to viewer tabs (role, tabindex, Enter/Space) (#4)
- Use full command as dedup key, not `slice(0,60)` (#5)
- Sort timeline events by time before rendering (#6)
- Pass `tracesDir` to footer text in `listRecaps` (#7)
- Increase `MAX_RECAPS` to 100, add eviction warning log (#8)
- Resolve assistant `parentID` for recap enrichment (#9)
- Remove unused `tracer` variable in test (#10)
- Clarify `--no-trace` backward-compat flag in docs (#1)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: add screenshots and update recap viewer documentation

- Add Summary tab and full-page screenshots to docs
- Update viewer section with 5-tab description
- Detail what Summary tab shows: files changed, outcomes, timeline, cost
- Add screenshot at top of recap.md for quick visual reference

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: move Recap to Use section, Telemetry to Reference

- Move Recap from Configure > Observability to Use (peer to Commands, Skills)
- Move Telemetry from Configure > Observability to Reference (internal analytics)
- Remove the Observability section entirely

Recap is a feature users interact with after sessions, not a config setting.
Telemetry is internal product analytics, not user-facing observability.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: viewer UX improvements from 100-trace analysis

- Collapse Files Changed after 5 entries with "Show all N files" toggle
- Rename "GENS" → "LLM Calls" in header cards
- Hide Tokens card when cost is $0 (not actionable without cost context)
- Hide Cost metric card when $0.00 (wasted space)
- Add prominent error summary banner right after header metrics
- Improved dbt outcome detection: catch [PASS], [ERROR], N of M, Compilation Error
- Outcome detection rate improved from 18% → 33% across 100 real traces
- Updated doc screenshots with cleaner samples

Tested across 100 real production traces: 0 crashes, 0 JS errors.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: always show Cost and Tokens cards

$0.00 is a valid cost (Anthropic Max plan). Hiding it implies
we don't support cost tracking.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: tool-agnostic outcome extraction for schema, validation, SQL, lineage tools

500-trace analysis revealed:
- Schema tasks: 0% outcome visibility → 100%
- Validation tasks: 0% outcome visibility → 100%
- SQL tasks: 55% outcome visibility → 100%

Added outcome extraction for:
- schema_inspect, lineage_check, altimate_core_validate results
- SQL error messages (not just row counts)
- Improved empty session display (shows prompt if available)

Tested across 500 diverse synthetic traces (SQL, Airflow, Dagster,
Python, schema, validation, migration, connectors) + 100 real traces.
0 crashes, 0 JS errors.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: address 4 new CodeRabbit review comments

- Add `inputHash` to `TraceFile.summary.loops` schema type (#11)
- Replace `startTrace()` API name with plain language in docs (#12)
- Use `CSS.escape()` for spanId in querySelector to handle special chars (#13)
- Sort spans by startTime before searching for error resolution (#14)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: round 3 review — sort spans once, clean narrative for 0 LLM calls

- Sort spans once before error resolution loop instead of per-error (perf)
- Narrative omits "Made 0 LLM calls" for tool-only sessions (UX)
- Updated tests to match new narrative format

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: add missing `altimate_change` markers for recap rename in upstream-shared files

Wrap renamed code (Tracer→Recap, trace→recap) with markers so the
Marker Guard CI check passes. The diff-based checker uses -U5 context
windows per hunk — markers must be close enough to added lines to
appear within each hunk's context.

Files fixed:
- `trace.ts` — handler body, option descriptions, viewer message, compat alias
- `app.tsx` — recapViewerServer return, openRecapInBrowser function
- `dialog-trace-list.tsx` — error title, Recaps title, compat alias
- `worker.ts` — getOrCreateRecap, part events, session title/finalization
- `index.ts` — .command(RecapCommand) registration

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: add altimate_change markers to all upstream-shared files

Marker Guard CI was failing — 5 upstream-shared files had custom
code (recap rename) without altimate_change markers.

Fixed: trace.ts, app.tsx, dialog-trace-list.tsx, worker.ts, index.ts

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: type errors in training-import.test.ts from main merge

Pre-existing type issues from main: mock missing `context`/`rule`
fields and readFile return type mismatch.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
suryaiyer95 added a commit that referenced this pull request Apr 17, 2026
Fixes five correctness, reliability, and portability issues surfaced by the
consensus code review of this branch.

CRITICAL #1 — Cross-dialect partitioned diff (`data-diff.ts`):
`runPartitionedDiff` built one partition WHERE clause with `sourceDialect`
and passed it as shared `where_clause` to the recursive `runDataDiff`, which
applied it to both warehouses identically. Cross-dialect partition mode
(MSSQL → Postgres) failed because the target received T-SQL
`DATETRUNC`/`CONVERT(DATE, …, 23)`. Now builds per-side WHERE using each
warehouse's dialect and bakes it into dialect-quoted subquery SQL for source
and target independently. The existing side-aware CTE injection handles the
rest.

MAJOR #2 — Azure AD token caching and refresh (`sqlserver.ts`):
`acquireAzureToken` fetched a fresh token on every `connect()` and embedded
it in the pool config with no refresh. Long-lived sessions silently failed
when the ~1h token expired. Adds a module-scoped cache keyed by
`(resource, client_id)` with proactive refresh 5 min before expiry, parsing
`expiresOnTimestamp` from `@azure/identity` or the JWT `exp` claim from the
`az` CLI fallback. Exposes `_resetTokenCacheForTests` for isolation.

MAJOR #3 — `joindiff` + cross-warehouse guard (`data-diff.ts`):
Explicit `algorithm: "joindiff"` combined with different warehouses produced
broken SQL (one task referencing two CTE aliases with only one injected).
Now returns an early error with a clear message steering users to `hashdiff`
or `auto`. Cross-warehouse detection switched from warehouse-name string
compare to dialect compare, matching the underlying SQL-divergence invariant.

MAJOR #4 — Dialect-aware identifier quoting in CTE wrapping (`data-diff.ts`):
`resolveTableSources` wrapped plain-table names with ANSI double-quotes for
all dialects. T-SQL/Fabric require `QUOTED_IDENTIFIER ON` for this to work;
default for `mssql`/tedious is ON, but user contexts (stored procs, legacy
collations) can override. Now accepts source/target dialect parameters and
delegates to `quoteIdentForDialect`, which was hoisted to module scope so
it can be reused across partition and CTE paths.

MAJOR #5 — Configurable Azure resource URL (`sqlserver.ts`, `normalize.ts`):
Token acquisition hardcoded `https://database.windows.net/`, blocking Azure
Government, Azure China, and sovereign-cloud customers. Now honours an
explicit `azure_resource_url` config field and otherwise infers the URL
from the host suffix (`.usgovcloudapi.net`, `.chinacloudapi.cn`). Adds the
usual camelCase/snake_case aliases in the SQL Server normalizer.

Also surfaces Azure auth error causes: if both `@azure/identity` and `az`
CLI fail, the thrown error includes both hints (redacted) so users know why
rather than seeing the generic "install @azure/identity or run az login"
message.

Tests: adds `data-diff-cross-dialect.test.ts` covering the cross-dialect
partition WHERE routing and the `joindiff` guard; extends
`data-diff-cte.test.ts` with dialect-aware quoting assertions for tsql,
fabric, and mysql; extends `sqlserver-unit.test.ts` with cache hit / expiry
refresh / client-id keyed cache tests, commercial/gov/china/custom resource
URL resolution, and the combined-error-hints surface.

All 41 sqlserver driver tests, 24 data-diff orchestrator tests, and 214
normalize/connections tests pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
anandgupta42 pushed a commit that referenced this pull request Apr 21, 2026
* fix: use synchronous DuckDB constructor to avoid bun runtime timeout

Bun's runtime never fires native addon async callbacks, so the async
`new duckdb.Database(path, opts, callback)` form would hit the 2-second
timeout fallback on every connection attempt.

Switch to the synchronous constructor form `new duckdb.Database(path)` /
`new duckdb.Database(path, opts)` which throws on error and completes
immediately in both Node and bun runtimes.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* revert: restore async DuckDB constructor — sync change was bogus

The async callback form with 2s fallback was already working correctly
at e3df5a4. The timeout was caused by a missing duckdb .node binary,
not a bun incompatibility.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat: add MSSQL/Fabric dialect mapping and data-parity support

- Add `warehouseTypeToDialect()` mapping: sqlserver→tsql, mssql→tsql,
  fabric→fabric, postgresql→postgres, mariadb→mysql. Fixes critical
  serde mismatch where Rust engine rejects raw warehouse type names.
- Update both `resolveDialect()` functions to use the mapping
- Add MSSQL/Fabric cases to `dateTruncExpr()` — DATETRUNC(DAY, col)
- Add locale-safe date literal casting via CONVERT(DATE, ..., 23)
- Register `fabric` in DRIVER_MAP (reuses sqlserver TDS driver)
- Add `fabric` normalize aliases in normalize.ts
- Add 15 SQL Server driver unit tests (TOP injection, truncation,
  schema introspection, connection lifecycle, result format)
- Add 9 dialect mapping unit tests

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* feat: add Azure AD authentication to SQL Server driver (7 flows)

- Support all 7 Azure AD / Entra ID auth types in `sqlserver.ts`:
  `azure-active-directory-password`, `access-token`, `service-principal-secret`,
  `msi-vm`, `msi-app-service`, `azure-active-directory-default`, `token-credential`
- Force TLS encryption for all Azure AD connections
- Dynamic import of `@azure/identity` for `DefaultAzureCredential`
- Add normalize aliases for Azure AD config fields (`authentication`,
  `azure_tenant_id`, `azure_client_id`, `azure_client_secret`, `access_token`)
- Add `fabric: SQLSERVER_ALIASES` to DRIVER_ALIASES
- Add 10 Azure AD unit tests covering all auth flows, encryption,
  and `DefaultAzureCredential` with managed identity

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* docs: add MSSQL and Microsoft Fabric documentation to data-parity SKILL.md

- Add SQL Server / Fabric schema inspection query in Step 2
- Add "SQL Server and Microsoft Fabric" section with:
  - Supported configurations table (sqlserver, mssql, fabric)
  - Fabric connection guide with Azure AD auth types
  - Algorithm behavior notes (joindiff vs hashdiff selection)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: delegate Azure AD credential creation to tedious and remove underscore column filter

- **Azure AD auth**: Pass `azure-active-directory-*` types directly to tedious
  instead of constructing `DefaultAzureCredential` ourselves. Tedious imports
  `@azure/identity` internally and creates credentials — avoids bun CJS/ESM
  `isTokenCredential` boundary issue that caused "not an instance of the token
  credential class" errors.
- **Auth shorthands**: Map `CLI`, `default`, `password`, `service-principal`,
  `msi`, `managed-identity` to their full tedious type names.
- **Column filter**: Remove `_.startsWith("_")` filter from `execute()` result
  columns — it stripped legitimate aliases like `_p` used by partition discovery,
  causing partitioned diffs to return empty results.
- **Tests**: Remove `@azure/identity` mock (no longer imported by driver),
  update auth assertions, add shorthand mapping tests, fix column filter test.
- **Verified**: All 97 driver tests pass. Full data-diff pipeline tested against
  real MSSQL server (profile, joindiff, auto, where_clause, partitioned).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: upgrade `mssql` to v12 with `ConnectionPool` isolation and row flattening

- Upgrade `mssql` from v11 to v12 (`tedious` 18 → 19)
- Use explicit `ConnectionPool` instead of global `mssql.connect()` to
  isolate multiple simultaneous connections
- Flatten unnamed column arrays — `mssql` merges unnamed columns (e.g.
  `SELECT COUNT(*), SUM(...)`) into a single array under the empty-string
  key; restore positional column values
- Proper column name resolution: compare `namedKeys.length` against
  flattened row length, fall back to synthetic `col_0`, `col_1`, etc.
- Update test mock to export `ConnectionPool` class and `createMockPool`

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: resolve TypeScript spread-type errors in Azure AD conditional options

Use ternary expressions (`x ? {...} : {}`) instead of short-circuit
(`x && {...}`) to avoid spreading a boolean value.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: resolve cubic review findings on MSSQL/Fabric PR

- P1: restrict `flattenRow` to only spread the empty-string key (`""`)
  where mssql merges unnamed columns, preserving legitimate array values
- P2: escape single quotes in `partitionValue` for date-mode branches in
  `buildPartitionWhereClause` (categorical mode already escaped)
- P2: add `fabric` to `PASSWORD_DRIVERS` set in registry for consistent
  password validation alongside `sqlserver`/`mssql`
- P2: fallback to `"(no values)"` when `d.values` is nullish to prevent
  template literal coercing `undefined` to the string `"undefined"`

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* test: add fabric connection path and flattenRow coverage

- sqlserver-unit: 3 tests for unnamed column flattening — verifies only
  the empty-string key is spread, legitimate named arrays are preserved
- driver-normalize: fabric type uses SQLSERVER_ALIASES (server → host,
  trustServerCertificate → trust_server_certificate)
- connections: fabric type is recognized in DRIVER_MAP and listed correctly

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* docs: document minimum versions and make @azure/identity optional

- Add "Minimum Version Requirements" table to SKILL.md covering SQL Server
  2022+, mssql v12, and @azure/identity v4 with rationale for each
- Document auth shorthands (CLI, default, password, service-principal, msi)
- Move @azure/identity from dependencies to optional peerDependencies so
  it is NOT installed by default — only required for Azure AD auth
- Add runtime check in sqlserver driver: if Azure AD auth type is requested
  but @azure/identity is missing, throw a clear install instruction error

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: acquire Azure AD tokens directly to bypass Bun browser-bundle resolution

- For `azure-active-directory-default` (CLI/default auth), acquire token
  ourselves instead of delegating to tedious's internal `@azure/identity`
- Strategy: try `DefaultAzureCredential` first, fall back to `az` CLI subprocess
- Bypasses Bun resolving `@azure/identity` to browser bundle where
  `DefaultAzureCredential` is a non-functional stub
- Also bypasses CJS/ESM `isTokenCredential` boundary mismatch
- All 31 driver unit tests pass, verified against real Fabric endpoint

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: auto-acquire Azure AD token for `azure-active-directory-access-token` when none supplied

The `azure-active-directory-access-token` branch passed
`token: config.token ?? config.access_token` to tedious. When neither
field was set on a connection (e.g. a `fabric-migration` entry that
declared the auth type but no token), tedious threw:

    TypeError: The "config.authentication.options.token" property
    must be of type string

This blocked any Fabric/MSSQL config that relied on ambient credentials
(Azure CLI / managed identity) but used the explicit
`azure-active-directory-access-token` type instead of the `default`
shorthand.

Refactor token acquisition (`DefaultAzureCredential` → `az` CLI
fallback) into a shared `acquireAzureToken()` helper used by both the
`default` path and the `access-token` path when no token was supplied.
Callers that pass an explicit token are unchanged.

Also harden `mock.module("node:child_process", ...)` in
`sqlserver-unit.test.ts` to spread the real module so sibling tests in
the same `bun test` run keep access to `spawn` / `exec` / `fork`.

Tests: 110 pass, 0 fail in `packages/drivers`.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix: side-aware CTE injection for cross-warehouse `data_diff` SQL-query mode

When `source` and `target` are both SQL queries, `resolveTableSources` wraps
them as `__diff_source` / `__diff_target` CTEs and the executor prepends the
combined `WITH …` block to every engine-emitted task. T-SQL and Fabric
parse-bind every CTE body even when unreferenced, so a task routed to the
source warehouse failed to resolve the target-only base table referenced
inside the unused `__diff_target` CTE (and vice versa), producing
`Invalid object name` errors from the wrong warehouse.

Return side-specific prefixes from `resolveTableSources` alongside the
combined one, and have the executor loop in `runDataDiff` pick the source
or target prefix per task when `source_warehouse !== target_warehouse`.
Same-warehouse behaviour is unchanged.

Adds `data-diff-cte.test.ts` covering plain-name passthrough, both-query
wrapping, side-specific CTE isolation, and CTE merging with engine-emitted
`WITH` clauses (10 tests).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore: regenerate `bun.lock` to match drivers `peerDependencies` layout

Commit 333a45c moved `@azure/identity` from `optionalDependencies` to
`peerDependencies` with `optional: true` in `packages/drivers/package.json`,
but the lockfile was not regenerated. That left CI under `--frozen-lockfile`
broken and made fresh installs silently diverge from the committed state.

Running `bun install` brings the lockfile in sync: `@azure/identity` is
recorded as an optional peer, and its transitive pins (`@azure/msal-browser`,
`@azure/msal-common`, `@azure/msal-node`) re-resolve to the versions required
by `tedious` and `snowflake-sdk`, matching the reachable runtime surface.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix: address all CRITICAL/MAJOR findings from multi-model review

Fixes five correctness, reliability, and portability issues surfaced by the
consensus code review of this branch.

CRITICAL #1 — Cross-dialect partitioned diff (`data-diff.ts`):
`runPartitionedDiff` built one partition WHERE clause with `sourceDialect`
and passed it as shared `where_clause` to the recursive `runDataDiff`, which
applied it to both warehouses identically. Cross-dialect partition mode
(MSSQL → Postgres) failed because the target received T-SQL
`DATETRUNC`/`CONVERT(DATE, …, 23)`. Now builds per-side WHERE using each
warehouse's dialect and bakes it into dialect-quoted subquery SQL for source
and target independently. The existing side-aware CTE injection handles the
rest.

MAJOR #2 — Azure AD token caching and refresh (`sqlserver.ts`):
`acquireAzureToken` fetched a fresh token on every `connect()` and embedded
it in the pool config with no refresh. Long-lived sessions silently failed
when the ~1h token expired. Adds a module-scoped cache keyed by
`(resource, client_id)` with proactive refresh 5 min before expiry, parsing
`expiresOnTimestamp` from `@azure/identity` or the JWT `exp` claim from the
`az` CLI fallback. Exposes `_resetTokenCacheForTests` for isolation.

MAJOR #3 — `joindiff` + cross-warehouse guard (`data-diff.ts`):
Explicit `algorithm: "joindiff"` combined with different warehouses produced
broken SQL (one task referencing two CTE aliases with only one injected).
Now returns an early error with a clear message steering users to `hashdiff`
or `auto`. Cross-warehouse detection switched from warehouse-name string
compare to dialect compare, matching the underlying SQL-divergence invariant.

MAJOR #4 — Dialect-aware identifier quoting in CTE wrapping (`data-diff.ts`):
`resolveTableSources` wrapped plain-table names with ANSI double-quotes for
all dialects. T-SQL/Fabric require `QUOTED_IDENTIFIER ON` for this to work;
default for `mssql`/tedious is ON, but user contexts (stored procs, legacy
collations) can override. Now accepts source/target dialect parameters and
delegates to `quoteIdentForDialect`, which was hoisted to module scope so
it can be reused across partition and CTE paths.

MAJOR #5 — Configurable Azure resource URL (`sqlserver.ts`, `normalize.ts`):
Token acquisition hardcoded `https://database.windows.net/`, blocking Azure
Government, Azure China, and sovereign-cloud customers. Now honours an
explicit `azure_resource_url` config field and otherwise infers the URL
from the host suffix (`.usgovcloudapi.net`, `.chinacloudapi.cn`). Adds the
usual camelCase/snake_case aliases in the SQL Server normalizer.

Also surfaces Azure auth error causes: if both `@azure/identity` and `az`
CLI fail, the thrown error includes both hints (redacted) so users know why
rather than seeing the generic "install @azure/identity or run az login"
message.

Tests: adds `data-diff-cross-dialect.test.ts` covering the cross-dialect
partition WHERE routing and the `joindiff` guard; extends
`data-diff-cte.test.ts` with dialect-aware quoting assertions for tsql,
fabric, and mysql; extends `sqlserver-unit.test.ts` with cache hit / expiry
refresh / client-id keyed cache tests, commercial/gov/china/custom resource
URL resolution, and the combined-error-hints surface.

All 41 sqlserver driver tests, 24 data-diff orchestrator tests, and 214
normalize/connections tests pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix: address PR #705 bot review findings (coderabbitai + cubic + copilot)

Addresses the remaining issues raised by coderabbitai, cubic-dev-ai, and the
Copilot PR reviewer on top of the multi-model consensus fix.

### CRITICAL

- **`@azure/identity` peer dep removed** (`drivers/package.json`)
  `mssql@12` → `tedious@19` bundles `@azure/identity ^4.2.1` as a regular
  dependency. Declaring it here as an optional peer was redundant and caused
  transitive-version-drift concerns. Users get the correct version
  automatically through the tedious chain; our lazy import handles the
  browser-bundle edge case itself.

### MAJOR

- **Cross-dialect date partition literal normalization** (`data-diff.ts`)
  `buildPartitionDiscoverySQL` on MSSQL returns a JS `Date` object, stringified
  upstream as `"Mon Jan 01 2024 …"`. `CONVERT(DATE, …, 23)` rejects that format.
  Normalize `partitionValue` to ISO `yyyy-mm-dd` before dialect casting so the
  T-SQL/Fabric path works end-to-end on dates discovered from MSSQL sources.

- **`crossWarehouse` uses resolved warehouse identity** (`data-diff.ts`)
  Previous commit gated on dialect compare, which treated two independent
  MSSQL instances as "same warehouse" and would have let `joindiff` route a
  JOIN through a warehouse that can't resolve the other side's base tables.
  Now resolves both sides' warehouse name (falling back to the default
  warehouse when a side is omitted) and compares identities — identity-based
  gating handles both the "undefined vs default" case (cubic) and the
  "same-dialect, different instance" case (Copilot).

- **Drop `mssql.connect()` fallback** (`sqlserver.ts`)
  `mssql@^12` guarantees `ConnectionPool` as a named export. The fallback
  silently re-introduced the global-shared-pool bug this branch was added
  to fix. Now throws a descriptive error if `ConnectionPool` is missing —
  cross-database pool interference cannot regress.

- **Non-string `config.authentication` guarded** (`sqlserver.ts`)
  Caller passing a pre-built `{ type, options }` block (or `null`) previously
  crashed with `TypeError: rawAuth.toLowerCase is not a function`. Now only
  applies the shorthand lookup when `rawAuth` is a string; other values pass
  through so tedious can handle them or reject them with its own error.

- **Unknown `azure-active-directory-*` subtype fails fast** (`sqlserver.ts`)
  Typos or future tedious subtypes previously dropped through all `else if`
  branches, producing a config with `encrypt: true` but no `authentication`
  block. tedious then surfaced an opaque error far from the root cause.
  Now throws with the offending subtype and the supported list.

- **`execSync` replaced with async `exec`** (`sqlserver.ts`)
  The `az account get-access-token` CLI fallback previously blocked the
  event loop for up to 15s. Switched to `util.promisify(exec)` so the
  connection path stays non-blocking.

- **Mixed named + unnamed column derivation preserves headers** (`sqlserver.ts`)
  Previously `SELECT name, COUNT(*), SUM(x)` produced either `["name", ""]`
  (blank header) or `["col_0", "col_1", "col_2"]` (lost `name`). Rewrote
  column/row derivation to iterate in one pass, preserving known named
  columns and synthesizing `col_N` only for expanded `""`-key positions.

### MINOR

- **`(no values)` fallback for empty `diff_row.values` array** (`tools/data-diff.ts`)
  `[].join(" | ") ?? "(no values)"` never fires because `""` is falsy-but-not-
  nullish. Gate on `d.values?.length` instead.

### Test / docs

- `sqlserver-unit.test.ts`: token-cache client-id test now counts actual
  `getToken` invocations (previous version only verified both got the same
  mocked token, which proved nothing about keying).
- `sqlserver-unit.test.ts`: "empty result" test now mirrors the real mssql
  shape (`recordset.columns` is a property *on* the recordset array, not a
  sibling key).
- `sqlserver-unit.test.ts`: added mixed-column regression tests — "name +
  COUNT + SUM" and "single unnamed column" — to lock in the derivation fix.
- `sqlserver-unit.test.ts`: stubbed async `exec` via `util.promisify.custom`
  so tests drive both the `execSync` legacy path and the new async path.
- `SKILL.md`: Fabric config fenced block now declares `yaml` (markdownlint
  MD040).

All tests: 43/43 sqlserver driver + 238/238 opencode test suite.

Attribution: findings identified by coderabbitai, cubic-dev-ai, and the
Copilot PR reviewer.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore: drop stale `@azure/identity` peer-dep entries from `bun.lock`

Commit 38cfb0e removed `@azure/identity` from the drivers package's
`peerDependencies` (tedious already bundles it), but the lockfile's
`packages/drivers` workspace section still carried the corresponding
`peerDependencies` and `optionalPeers` blocks. CI running
`bun install --frozen-lockfile` would fail on the drift.

Minimal edit — just removes the two stale blocks. No resolution changes
(`bun install --frozen-lockfile` passes with "no changes").

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix: CI — isolate `data-diff-cross-dialect` tests from other files

The prior integration-style test mocked the Registry module globally with
`mock.module(".../registry", ...)`, which leaks across all test files in
bun:test's single-process runner. That caused 14 unrelated tests in
`connections.test.ts`, `telemetry-safety.test.ts`, and
`dbt-first-execution.test.ts` to fail in CI.

Additionally, the test relied on `mock.module("@altimateai/altimate-core")`
to supply a fake `DataParitySession`. The npm-published 0.2.6 of that
package does not export `DataParitySession` (sessions are only in the
locally-built `altimate-core-internal` binary), and Bun's `mock.module`
cannot override a package that another test file has already imported —
so the integration test was structurally unreliable.

Resolution:

1. **Export pure SQL-builder helpers** from `data-diff.ts`
   (`dateTruncExpr`, `buildPartitionWhereClause`) and unit-test them
   directly. No module mocking required; the test directly exercises the
   logic the CRITICAL/MAJOR fix changed.

2. **Move the `joindiff` + cross-warehouse guard earlier** in `runDataDiff`
   — before the NAPI import. Semantically identical for callers (guard
   still fires, same error message, `steps: 0`), but now it can be
   integration-tested without any NAPI mock. Preserves end-to-end
   wiring coverage for the guard.

3. **Rewrite `data-diff-cross-dialect.test.ts`** as pure-function unit
   tests for the partition WHERE logic + a real `runDataDiff` call for
   the joindiff guard. No more cross-file mock pollution.

Functionality unchanged:
- `runDataDiff` behavior for real callers is identical. The only
  observable difference is error-ordering: if a caller simultaneously
  omits NAPI and passes `joindiff + cross-warehouse`, they now get the
  "joindiff requires same warehouse" error instead of the NAPI-missing
  error. That's strictly better UX — NAPI availability is a deployment
  concern, `joindiff`+cross-warehouse is a user error.
- `buildPartitionWhereClause` and `dateTruncExpr` are now exported but
  semantically unchanged — same inputs, same outputs.

Test results:
- 2821 altimate tests pass, 0 fail
- 43 sqlserver driver tests pass, 0 fail
- The 19 remaining full-suite failures (`mcp/`, `tool/project-scan`,
  `plan-approval-phrase`) are pre-existing on `main` and unrelated.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix: follow-up PR bot review findings (cubic P1/P2 + coderabbit MAJOR/MINOR)

Addresses 5 substantive issues raised by the latest round of bot reviews.

### P1 / MAJOR

- **MySQL/MariaDB week-partition values no longer corrupted** (cubic P1,
  data-diff.ts:610) — the prior ISO `yyyy-mm-dd` normalization applied to
  every dialect silently rewrote MySQL `DATE_FORMAT(%Y-%u)` outputs like
  `"2024-42"` into invalid dates, producing WHERE clauses that never
  match. Scope the normalization to T-SQL / Fabric only — those use
  `CONVERT(DATE, …, 23)` which is the only code path that requires ISO.
  Postgres, MySQL, ClickHouse, BigQuery, Oracle all get the raw value
  verbatim, matching their own `DATE_TRUNC`/`toStartOf*` output.

- **Partitioned diff no longer drops extra_columns** (coderabbit MAJOR,
  data-diff.ts:824) — the partition fix wraps each side as a SELECT
  subquery before recursing. `discoverExtraColumns` skips SQL queries
  (only inspects plain table names), so the recursive `runDataDiff` fell
  through to key-only comparison, silently losing value-level diffs.
  Now `runPartitionedDiff` runs discovery ONCE on the plain source
  table up-front and passes the resolved `extra_columns` explicitly to
  each recursive call. Audit-column exclusion metadata is also
  propagated to the aggregated result for user reporting.

### P2 / MINOR

- **`azure_resource_url` trailing slash normalized** (cubic P2,
  sqlserver.ts:50) — an explicit `"https://custom-host"` (no slash)
  would produce an invalid OAuth scope `"https://custom-host.default"`.
  Enforce a trailing slash in `resolveAzureResourceUrl`.

- **`az account get-access-token` uses `execFile`** (coderabbit,
  sqlserver.ts:200) — replaces `exec(<shell command string>)` with
  `execFile("az", [args])` so user-supplied `azure_resource_url` can't
  introduce shell metacharacters into the command string. Also updates
  the test harness to stub both `exec` and `execFile`.

### Test isolation / coverage

- **Added same-dialect cross-warehouse joindiff test** (cubic,
  data-diff-cross-dialect.test.ts:97) — two MSSQL servers with
  different hosts must still be gated by the joindiff guard; previous
  tests only exercised mixed dialects.

- **Added MySQL week-partition regression tests** — prevent future
  revivals of the dialect-unaware ISO rewrite.

- **Added trailing-slash `azure_resource_url` test.**

Test results:
- 44/44 sqlserver driver tests pass
- 2824/2824 altimate tests pass, 0 fail
- Remaining full-suite failures (`mcp/`, `tool/project-scan`,
  `plan-approval-phrase`) are pre-existing on `main`.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant